SD3.5 Large support #1719

Merged
merged 31 commits into sd3
Nov 1, 2024

Conversation

kohya-ss
Owner

SD3 Medium fine-tuning works.

@kohya-ss
Owner Author

SD3.5L training seems to work now. --disable_mmap_load_safetensors is recommended on Windows for faster safetensors loading.

Implemented block swap. With 30 blocks swapped, SD3.5L training may be possible with 12GB of VRAM.

Random dropout of the Text Encoder embeddings is not implemented yet, and neither is LoRA training.
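
For reference, a rough sketch of the block-swap idea (this is not the actual implementation in this branch; the block interface and names are simplified placeholders): blocks at the tail of the MMDiT stack stay in CPU memory and are moved to the GPU only while they are needed.

```python
import torch
import torch.nn as nn

class BlockSwapStack(nn.Module):
    """Illustrative only: keep the last `blocks_to_swap` blocks on the CPU and
    move each one to the GPU just before it runs, then evict it again."""

    def __init__(self, blocks: nn.ModuleList, blocks_to_swap: int, device: str = "cuda"):
        super().__init__()
        self.blocks = blocks
        self.device = device
        self.swap_start = len(blocks) - blocks_to_swap
        for i, block in enumerate(self.blocks):
            block.to(device if i < self.swap_start else "cpu")

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for i, block in enumerate(self.blocks):
            if i >= self.swap_start:
                block.to(self.device)   # bring the swapped block into VRAM
            x = block(x)
            if i >= self.swap_start:
                block.to("cpu")         # evict it to keep peak VRAM low
        return x
```

The real code is more involved (transfers are overlapped with compute, and the backward pass also matters), but this shows why swapping more blocks trades speed for lower VRAM usage.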

@kohya-ss kohya-ss mentioned this pull request Oct 24, 2024
@FurkanGozukara

Awesome, can't wait to test it once it has matured.

@kohya-ss
Owner Author

Basic LoRA training (MMDiT only) may work now.

@kohya-ss
Owner Author

Fixed latent scaling/shifting.

@kohya-ss
Owner Author

--clip_l_dropout_rate, --clip_g_dropout_rate and --t5_dropout_rate options have been added to sd3_train.py and sd3_train_network.py. Each of these options sets the output of the corresponding Text Encoder to 0 with the specified probability. The default value is 0 (no dropout). The value in the SAI technical report is 0.464 (46.4%) for each encoder, but the optimal values seem to vary depending on the dataset.

LoRA training including the Text Encoders should now work correctly. Each of the dropout options also works for LoRA training, but the optimal values are unknown.
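
Conceptually, each option does something like the following (a simplified sketch, not the actual code in the scripts; names are placeholders):

```python
import torch

def maybe_drop_te_output(cond: torch.Tensor, dropout_rate: float, is_training: bool) -> torch.Tensor:
    """Zero the whole text-encoder output with probability `dropout_rate`.

    Each encoder (CLIP-L, CLIP-G, T5-XXL) gets its own independent draw, so with
    rate p all three are dropped at the same time only with probability p**3.
    """
    if is_training and dropout_rate > 0.0 and torch.rand(1).item() < dropout_rate:
        return torch.zeros_like(cond)
    return cond

# Usage sketch (one independent draw per encoder):
# clip_l_out = maybe_drop_te_output(clip_l_out, args.clip_l_dropout_rate, True)
# clip_g_out = maybe_drop_te_output(clip_g_out, args.clip_g_dropout_rate, True)
# t5_out     = maybe_drop_te_output(t5_out,     args.t5_dropout_rate,     True)
```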

@bghira

bghira commented Oct 27, 2024

L and G aren't dropped out separately, and you can't drop out all of CLIP either, because it's used for adanorm.

@kohya-ss
Owner Author

L and G aren't dropped out separately, and you can't drop out all of CLIP either, because it's used for adanorm.

Thank you! Hmm, Appendix B.3 of their technical paper http://arxiv.org/pdf/2403.03206 states the following:

For unconditional diffusion guidance (Ho & Salimans, 2022), we set the outputs of each of the three text encoders independently to zero with a probability of 46.4%, such that we roughly train an unconditional model in 10% of all steps.

So I think it makes sense to drop them separately. (With three independent dropouts at 46.4%, all three text encoders are zeroed at the same time with probability 0.464³ ≈ 0.10, which matches the paper's "roughly 10% of all steps".)

@FurkanGozukara

@kohya-ss what is the purpose of dropout? So when SAI was training, they randomly dropped out each one almost 50% of the time?

So with dropout, does that mean only the U-Net part is being trained?

@bghira

bghira commented Oct 27, 2024

This kind of dropout is for pretraining, not for finetuning. For small datasets, dropping out the text encoders that much will merely harm the model; it will overcondition the uncond space.

@waomodder

@kohya-ss
Sorry to report it here of all places, but SD3.5 Large LoRA training worked properly!

[three sample result images attached]

I used FP8 base, learning_rate 1e-4, dim 64, alpha 32.0, roughly 8,000 images, and ran 24,000 steps.
The loss converged stably at around 0.1.

@kohya-ss
Owner Author

@kohya-ss what is the purpose of dropout? So when SAI was training, they randomly dropped out each one almost 50% of the time?

I don't know the details either, so please see the technical paper: https://arxiv.org/pdf/2403.03206.

@kohya-ss
Owner Author

Sorry to report it here of all places, but SD3.5 Large LoRA training worked properly!

I'm glad it seems to have worked fine!

@waomodder

@Bocchi-Chan2023
The optimizer is AdamW8bit, and the scheduler is cosine.

@mliand

mliand commented Oct 29, 2024

INFO clip_l is not included in the checkpoint and clip_l_path is not provided  sd3_utils.py:117
INFO clip_g is not included in the checkpoint and clip_g_path is not provided  sd3_utils.py:177
INFO t5xxl is not included in the checkpoint and t5xxl_path is not provided  sd3_utils.py:232

The text encoder path parameters specified in the toml configuration file do not seem to be picked up when training.

@kohya-ss
Owner Author

The text encoder path parameters specified in the toml configuration file do not seem to be picked up when training.

Please specify the respective weight files downloaded from HuggingFace for each option: clip_l, clip_g, t5xxl.
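
For example, if you pass arguments through a --config_file style toml, the relevant entries would look something like this (the paths are placeholders; point them at the files downloaded from HuggingFace):

```toml
pretrained_model_name_or_path = "/path/to/sd3.5_large.safetensors"
clip_l = "/path/to/clip_l.safetensors"
clip_g = "/path/to/clip_g.safetensors"
t5xxl = "/path/to/t5xxl_fp16.safetensors"
```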

@kohya-ss
Owner Author

split_qkv, train_block_indices and emb_dims network_args should work for LoRA training.
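
For example, in a --config_file style toml (the value formats below are my assumptions for illustration, not verified syntax; please check the README and the network module source for the authoritative format):

```toml
network_args = [
  "split_qkv=True",            # split the fused qkv projection for LoRA
  "train_block_indices=0,1,2", # assumed format: which MMDiT blocks to train
  # emb_dims is also accepted; see the README for its exact value format
]
```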

@kohya-ss
Owner Author

Added SD3.5M support.

The --pos_emb_random_crop_rate option has been added to sd3_train.py and sd3_train_network.py. It specifies the probability of the random crop augmentation described on the model card: https://huggingface.co/stabilityai/stable-diffusion-3.5-medium.

0 means no random crop, 1 means always. The default is 0.

@nephi-dev

Can't load the VAE for either 3.5 or 3; I'm getting a missing key error.

@nephi-dev

Can't load the VAE for either 3.5 or 3; I'm getting a missing key error.

Actually, I managed to extract the VAE directly from SD3 and it works now. Maybe I was using the wrong one.

@kohya-ss
Owner Author

Can't load the VAE for either 3.5 or 3; I'm getting a missing key error.

SAI's SD3.5L/M checkpoints seem to have the VAE built in. Please omit the --vae option.

@nephi-dev

Do you have a minimal-parameters config to train a LoRA on SD3.5M? I've tried some configs (that I've used for FLUX), but got no results at all.

@kohya-ss
Owner Author

Do you have a minimal-parameters config to train a LoRA on SD3.5M? I've tried some configs (that I've used for FLUX), but got no results at all.

README.md has been updated.

@kohya-ss
Owner Author

Added support for SD3.5M multi-resolution training. The feature has not been fully tested yet, so please let us know if you find any issues.

The latent cache format has changed, so please delete the previous cache files (old caches still work, but garbage will remain in the files).

The idea and code for SD3.5M's positional embedding were contributed by KBlueLeaf. Thank you, KBlueLeaf!
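
Untested on my side, but a dataset config for the multi-resolution support might look roughly like this (whether separate [[datasets]] entries per resolution is the intended usage is an assumption here; the updated README is authoritative, and the paths are placeholders):

```toml
[general]
enable_bucket = true
caption_extension = ".txt"

[[datasets]]
resolution = 1024
batch_size = 1

  [[datasets.subsets]]
  image_dir = "/path/to/images"
  num_repeats = 1

[[datasets]]
resolution = 768
batch_size = 2

  [[datasets.subsets]]
  image_dir = "/path/to/images"
  num_repeats = 1
```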

@kohya-ss kohya-ss marked this pull request as ready for review November 1, 2024 10:05
@kohya-ss
Owner Author

kohya-ss commented Nov 1, 2024

Fixed a memory leak when caching latents. This does not affect data that has already been cached.

Images were not being discarded after latent conversion.
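
In other words, the caching loop kept references to the full-size image tensors. A minimal sketch of the corrected pattern (not the actual caching code; a diffusers-style VAE interface is assumed here):

```python
import torch

def cache_latents(vae, images: list) -> list:
    """Encode images to latents one by one and drop each image as soon as it is used."""
    latents = []
    with torch.no_grad():
        for i in range(len(images)):
            latent = vae.encode(images[i]).latent_dist.sample().to("cpu")
            latents.append(latent)
            images[i] = None  # release the image tensor so it can be garbage collected
    return latents
```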

@kohya-ss
Owner Author

kohya-ss commented Nov 1, 2024

As the main functionality appears to be working, I'll proceed with merging this branch. Thank you for your significant contributions.

edit: This branch will be removed in the next few days.

@kohya-ss kohya-ss merged commit 264328d into sd3 Nov 1, 2024
2 checks passed
@FurkanGozukara

FurkanGozukara commented Nov 3, 2024

--disable_mmap_load_safetensors

Why is this not the default? What is the reason for not making it the default? Thank you.

For example, on RunPod machines model loading is painfully slow when training (e.g. loading FLUX); can it help there too?
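
For reference, what the option changes conceptually (a rough sketch, not the repository's actual loading code): by default safetensors memory-maps the file, while the disable-mmap path reads the whole file into RAM first and deserializes from bytes, which can be much faster on Windows and on some network-backed storage.

```python
from safetensors.torch import load, load_file

def load_state_dict(path: str, disable_mmap: bool = False):
    if disable_mmap:
        # Read the entire file into memory, then deserialize from the bytes.
        with open(path, "rb") as f:
            return load(f.read())
    # Default: memory-mapped loading.
    return load_file(path)
```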
